Minkowski-r Back-Propagation: Learning in Connectionist Models with Non-Euclidian Error Signals

Neural Information Processing Systems

Many connectionist learning models are implemented using gradient descent in a least-squares error function of the output and teacher signal. This error function is generalized here to the family of Minkowski-r power metrics, in which the error is raised to the power r rather than squared. For small r a "city-block" error metric is approximated, and for large r the "maximum" or "supremum" metric is approached. An implementation of Minkowski-r back-propagation is described. Different r values may be appropriate for different tasks; in particular, small r can reduce the effects of outliers (noise).
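The Minkowski-r error and the output gradient it induces can be sketched as follows. This is a minimal illustration under my own naming, not the authors' implementation; it shows how r = 2 recovers the familiar least-squares case, r = 1 gives the city-block metric, and large r approaches the supremum metric.

```python
import math

def minkowski_r_error(output, target, r):
    """Minkowski-r error: E = (1/r) * sum_i |o_i - t_i|^r.
    r = 2 gives the usual half sum-of-squares; r = 1 the city-block metric."""
    return sum(abs(o - t) ** r for o, t in zip(output, target)) / r

def minkowski_r_gradient(output, target, r):
    """dE/do_i = |o_i - t_i|^(r-1) * sign(o_i - t_i)."""
    return [abs(o - t) ** (r - 1) * math.copysign(1.0, o - t)
            for o, t in zip(output, target)]

out, tgt = [0.9, 0.2, 0.4], [1.0, 0.0, 0.5]

# r = 2 reproduces the least-squares error and the (o - t) gradient.
e2 = minkowski_r_error(out, tgt, 2.0)
g2 = minkowski_r_gradient(out, tgt, 2.0)

# r = 1 gives the city-block metric; the gradient is just the sign of the
# difference, so large outliers no longer dominate the weight updates.
e1 = minkowski_r_error(out, tgt, 1.0)

# For large r the Minkowski norm (sum |d|^r)^(1/r) approaches max |d|.
norm_big_r = sum(abs(o - t) ** 20 for o, t in zip(out, tgt)) ** (1 / 20)
```

Because the gradient for r = 1 carries only the sign of each residual, a single badly mislabeled (outlier) pattern contributes no more to a weight update than any other pattern, which is the sense in which small r reduces the effect of noise.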


Hanson, Stephen Jose, Burr, David J.

It can be shown that neural-like networks containing a single hidden layer of nonlinear activation units can learn to do a piece-wise linear partitioning of a feature space [2]. One result of such a partitioning is a complex gradient surface on which decisions about new input stimuli will be made. The generalization, categorization and clustering properties of the network are therefore determined by this mapping of input stimuli to this gradient surface in the output space. This gradient surface is a function of the conditional probability distributions of the output vectors given the input feature vectors, as well as a function of the error relating the teacher signal and output.
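A single-hidden-layer network of the kind described can be trained by back-propagating a Minkowski-r output delta through the hidden layer. The sketch below is my own illustrative code, assuming a small 2-3-1 sigmoid architecture, not the authors' implementation; setting r = 2 recovers standard back-propagation.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(W1, b1, W2, b2, x, t, r=2.0, lr=0.1):
    """One gradient-descent step for a one-hidden-layer sigmoid network
    with a Minkowski-r output error; returns the pre-update error."""
    # forward pass: input -> hidden -> output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    o = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
         for row, b in zip(W2, b2)]
    # output deltas: |o - t|^(r-1) * sign(o - t) * o * (1 - o)
    d_out = [abs(oi - ti) ** (r - 1) * math.copysign(1.0, oi - ti)
             * oi * (1 - oi) for oi, ti in zip(o, t)]
    # hidden deltas back-propagated through W2
    d_hid = [hi * (1 - hi) * sum(W2[k][j] * d_out[k]
                                 for k in range(len(d_out)))
             for j, hi in enumerate(h)]
    # gradient-descent weight updates
    for k in range(len(W2)):
        for j in range(len(h)):
            W2[k][j] -= lr * d_out[k] * h[j]
        b2[k] -= lr * d_out[k]
    for j in range(len(W1)):
        for i in range(len(x)):
            W1[j][i] -= lr * d_hid[j] * x[i]
        b1[j] -= lr * d_hid[j]
    return sum(abs(oi - ti) ** r for oi, ti in zip(o, t)) / r

# tiny demo on a single pattern (assumed 2-3-1 architecture)
random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b1 = [0.0, 0.0, 0.0]
W2 = [[random.uniform(-1, 1) for _ in range(3)]]
b2 = [0.0]
errs = [train_step(W1, b1, W2, b2, [1.0, 0.0], [1.0], r=2.0, lr=0.1)
        for _ in range(50)]
```

Because only the output delta changes with r, the rest of the back-propagation machinery is untouched, which is what makes the Minkowski-r generalization a drop-in modification of the standard algorithm.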
